December 11, 2015

Data Description

Raw Data Spreadsheet

The raw data were provided as an Excel file with ~300 tabs like this:

Patient Data

Along with timeseries measurements, the following static covariates were given for each patient:

  • Age
  • Sex
  • GCS (Glasgow Coma Scale) - Integer score from 3 (deep unconsciousness) to 15 (normal)
  • Marshall Classification - Similar to GCS; Integer from 1 to 6 (lower is better)
  • Estimated Initial Time of Injury

An outcome score, the GOS (Glasgow Outcome Scale), was also recorded at 3, 6, 12, and 24 months:

  • GOS - 1=Dead, 2-3=Severely Disabled, 4-5=Good Outcome
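The GOS coding above can be written as a small lookup. This is only an illustrative sketch: the function name `gos_category` is hypothetical and not part of the original analysis.

```python
def gos_category(gos: int) -> str:
    """Map a GOS score (1-5) to the coarse category listed above."""
    if gos == 1:
        return "Dead"
    if gos in (2, 3):
        return "Severely Disabled"
    if gos in (4, 5):
        return "Good Outcome"
    # Anything else is not a valid GOS score under the coding above.
    raise ValueError(f"GOS must be an integer from 1 to 5, got {gos!r}")
```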

Data Preparation

For the sake of modeling, the following assumptions were made:

  • Only measurements within 48 hours of initial injury were considered
  • Patients with no GCS or Marshall score were excluded
  • Patients with fewer than 8 hours of PbtO2 measurements were also excluded
  • Only the 3 month GOS outcome was used, falling back to the 6 month GOS where the 3 month score was not available

There were 339 patients in the raw data but only 268 in the filtered dataset.

Note: Of the 268 patients in the filtered set, 18 had a missing 3 month GOS
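The preparation rules above can be sketched as a per-patient filter. This is a minimal sketch, not the code used for the report: the record layout (`gcs`, `marshall`, `pbto2`, `gos_3m`, `gos_6m`) is hypothetical, the 48 hour window is assumed to be applied before the 8 hour check, and "8 hours of PbtO2 measurements" is assumed to mean the span between the first and last retained reading.

```python
from typing import Optional

WINDOW_HOURS = 48.0     # keep only measurements within 48 h of injury
MIN_PBTO2_HOURS = 8.0   # minimum PbtO2 coverage (assumed: first-to-last span)


def prepare_patient(patient: dict) -> Optional[dict]:
    """Return a filtered copy of one patient record, or None if excluded.

    `patient["pbto2"]` is assumed to be a list of
    (hours_since_injury, value) pairs; all field names are hypothetical.
    """
    # Exclude patients without a GCS or Marshall score.
    if patient.get("gcs") is None or patient.get("marshall") is None:
        return None

    # Keep only readings inside the 48 h window.
    pbto2 = [(t, v) for t, v in patient.get("pbto2", [])
             if 0.0 <= t <= WINDOW_HOURS]
    if not pbto2:
        return None

    # Exclude patients with too little PbtO2 coverage.
    times = [t for t, _ in pbto2]
    if max(times) - min(times) < MIN_PBTO2_HOURS:
        return None

    # Use the 3 month GOS, falling back to the 6 month GOS.
    # The result may still be None; such patients are kept.
    gos = patient.get("gos_3m")
    if gos is None:
        gos = patient.get("gos_6m")

    prepared = dict(patient)
    prepared["pbto2"] = pbto2
    prepared["gos"] = gos
    return prepared
```

Applying `prepare_patient` to every record and dropping the `None` results would give the filtered set under these assumptions.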

Descriptive Stats

Static Variable Distributions

Frequency of non-time-dependent values:

Comparing Variables to Outcome

The static variables show clear relationships with outcome:

Blood Gas Measurements

Timeseries measurements for 4 random patients:

Other Stuff

Binary GLM Model

Exhaustive GLM Model Results

Term            Estimate     Uncond. variance   Nb models   Importance   +/- (alpha=0.05)
pha_0_7.35      -0.0372521   0.0069486           29         0.2454711    0.1641506
pao2_100_inf    -0.0416354   0.0076650           35         0.2862554    0.1724048
paco2_0_35      -0.1039550   0.0256525           52         0.4475231    0.3153981
pbto2_0_20      -0.1391531   0.0323655           52         0.5358749    0.3542705
sex              0.1722509   0.0412554           52         0.5767763    0.3999763
marshall        -0.2438273   0.0502777           65         0.7168432    0.4415518
paco2_45_inf     0.2923234   0.0487595           77         0.8269899    0.4348341
pha_7.45_inf    -0.3193026   0.0514109           79         0.8379189    0.4464999
pbto2_100_inf   -0.8610325   0.3929205           98         0.9895500    1.2343722
(Intercept)     -1.5639168   0.0472386          100         1.0000000    0.4279987
age             -0.7251374   0.0402352          100         1.0000000    0.3949996
gcs              0.5393797   0.0343400          100         1.0000000    0.3649167
icp1_20_inf     -0.6818782   0.1017011          100         1.0000000    0.6279955
pao2_0_30       -0.7211150   0.1144418          100         1.0000000    0.6661716
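One way to read the table is to turn each row into an interval and check whether it excludes zero. The sketch below assumes the "+/- (alpha=0.05)" column is the half-width of a 95% interval around the model-averaged estimate; that interpretation is an assumption, not something stated in the output. The numbers are copied from the table.

```python
# (estimate, half-width) per term, copied from the table above.
# Assumption: "+/-" is the half-width of a 95% interval around the estimate.
ROWS = {
    "pha_0_7.35": (-0.0372521, 0.1641506),
    "pao2_100_inf": (-0.0416354, 0.1724048),
    "paco2_0_35": (-0.1039550, 0.3153981),
    "pbto2_0_20": (-0.1391531, 0.3542705),
    "sex": (0.1722509, 0.3999763),
    "marshall": (-0.2438273, 0.4415518),
    "paco2_45_inf": (0.2923234, 0.4348341),
    "pha_7.45_inf": (-0.3193026, 0.4464999),
    "pbto2_100_inf": (-0.8610325, 1.2343722),
    "(Intercept)": (-1.5639168, 0.4279987),
    "age": (-0.7251374, 0.3949996),
    "gcs": (0.5393797, 0.3649167),
    "icp1_20_inf": (-0.6818782, 0.6279955),
    "pao2_0_30": (-0.7211150, 0.6661716),
}


def interval(estimate: float, half_width: float) -> tuple:
    """Return (lower, upper) for estimate +/- half_width."""
    return (estimate - half_width, estimate + half_width)


def excludes_zero(estimate: float, half_width: float) -> bool:
    """True when the interval lies entirely on one side of zero."""
    lower, upper = interval(estimate, half_width)
    return lower > 0.0 or upper < 0.0


clear_terms = sorted(name for name, row in ROWS.items() if excludes_zero(*row))
print(clear_terms)
# → ['(Intercept)', 'age', 'gcs', 'icp1_20_inf', 'pao2_0_30']
```

Under this reading, the terms whose intervals exclude zero are exactly the ones with importance 1.0 in the table.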

Exhaustive Model Results

Nonlinear Modeling

A modified model:

\[ \operatorname{logit}(\Pr(y_i = 1)) = \alpha + \beta \cdot X_i + f(G_i) \]

where

\[ X_i = [\text{Gender}_i, \text{Age}_i, \text{ComaScore}_i, \text{MarshallScore}_i] \]

and

\[ f(G_i) = \frac{1}{n_i} \sum_{j=1}^{n_i} \left( \frac{c_1}{1 + e^{-c_2 (G_{ij} - c_3)}} + \frac{c_4}{1 + e^{-c_5 (G_{ij} - c_6)}} \right) \]

\[ n_i = \text{length of timeseries for patient } i \]
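The formulas above translate directly into code. The following is a minimal sketch of the double logistic term and the resulting outcome probability, not the fitting code used for the report; the function names and the argument order of `c = (c1, ..., c6)` are illustrative choices.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def double_logistic(g: float, c: Sequence[float]) -> float:
    """Sum of two scaled logistic curves evaluated at one measurement g."""
    c1, c2, c3, c4, c5, c6 = c
    return c1 * sigmoid(c2 * (g - c3)) + c4 * sigmoid(c5 * (g - c6))


def f(series: Sequence[float], c: Sequence[float]) -> float:
    """Average of the double logistic over a patient's timeseries G_i."""
    if len(series) == 0:
        raise ValueError("timeseries must contain at least one measurement")
    return sum(double_logistic(g, c) for g in series) / len(series)


def predict_probability(alpha: float, beta: Sequence[float],
                        x: Sequence[float], series: Sequence[float],
                        c: Sequence[float]) -> float:
    """Invert the logit: P(y_i = 1) = sigmoid(alpha + beta . X_i + f(G_i))."""
    eta = alpha + sum(b * xi for b, xi in zip(beta, x)) + f(series, c)
    return sigmoid(eta)
```

Dividing by the series length keeps patients with long and short timeseries on the same scale, which is what the 1/n_i factor in the formula does.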

Double Logistic Function Examples

Random Functions

These are functions drawn from the priors in the model; they show the range of shapes the model can represent:
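Drawing curves from a prior can be sketched as below. The actual priors are not given in this report, so the ones used here (standard normal heights and slopes, midpoints uniform over the plotted range) are placeholders chosen purely for illustration, and all names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def double_logistic(g: float, c: Sequence[float]) -> float:
    """Sum of two scaled logistic curves, as in the model definition."""
    c1, c2, c3, c4, c5, c6 = c
    return c1 * sigmoid(c2 * (g - c3)) + c4 * sigmoid(c5 * (g - c6))


def draw_parameters(rng: random.Random, lo: float, hi: float) -> tuple:
    """Draw (c1..c6) from PLACEHOLDER priors, not the report's priors."""
    return (
        rng.gauss(0.0, 1.0),   # c1: height of first curve (assumed prior)
        rng.gauss(0.0, 1.0),   # c2: slope of first curve (assumed prior)
        rng.uniform(lo, hi),   # c3: midpoint of first curve (assumed prior)
        rng.gauss(0.0, 1.0),   # c4: height of second curve (assumed prior)
        rng.gauss(0.0, 1.0),   # c5: slope of second curve (assumed prior)
        rng.uniform(lo, hi),   # c6: midpoint of second curve (assumed prior)
    )


def sample_curves(n_curves: int, grid: Sequence[float],
                  seed: int = 0) -> List[List[float]]:
    """Evaluate n_curves random double logistic functions on a grid."""
    rng = random.Random(seed)
    lo, hi = min(grid), max(grid)
    curves = []
    for _ in range(n_curves):
        c = draw_parameters(rng, lo, hi)
        curves.append([double_logistic(g, c) for g in grid])
    return curves
```

Plotting each returned list against the grid gives one random function per line.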

Nonlinear Effects

Sample Size Effects (Fully Simulated Data)

Function Forms on Semi-Simulated Data

By semi-simulated, I mean taking the real data and hard-coding the coefficient and function values.
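One plausible reading of this setup is sketched below: keep the real covariates and timeseries, fix the model parameters by hand, and draw outcomes from the implied probabilities. This is an assumption about the procedure, not a description of the code behind the results, and the function and argument names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def f(series: Sequence[float], c: Sequence[float]) -> float:
    """Mean double logistic value over one patient's timeseries."""
    c1, c2, c3, c4, c5, c6 = c
    total = sum(c1 * sigmoid(c2 * (g - c3)) + c4 * sigmoid(c5 * (g - c6))
                for g in series)
    return total / len(series)


def simulate_outcomes(covariates: Sequence[Sequence[float]],
                      series_by_patient: Sequence[Sequence[float]],
                      alpha: float, beta: Sequence[float],
                      c: Sequence[float], seed: int = 0) -> List[int]:
    """Draw binary outcomes from hard-coded parameters and real inputs."""
    rng = random.Random(seed)
    outcomes = []
    for x, series in zip(covariates, series_by_patient):
        # Linear predictor from the model definition.
        eta = alpha + sum(b * xi for b, xi in zip(beta, x)) + f(series, c)
        p = sigmoid(eta)
        # Bernoulli draw with success probability p.
        outcomes.append(1 if rng.random() < p else 0)
    return outcomes
```

Because the true parameters are known by construction, fitting the model to the simulated outcomes shows whether the function shapes can be recovered from data of this size and structure.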

Results - Intracranial Pressure levels

Results - PaCO2 levels

Results - pH levels

Results - PaO2 levels

Results - PbtO2 levels